Honor `DAP-Auth-Token` or `Authorization: Bearer` #1306

tgeoghegan · 2023-04-27T00:08:41Z

When acting as an HTTP server (i.e., helpers handling aggregation job or aggregate share messages, or leaders handling collection requests), Janus will now check for the authentication token in the more standard Authorization: Bearer <token> format defined in RFC 6750, as it already does in the aggregator API. If that header is absent, it falls back to checking DAP-Auth-Token for compatibility with existing peers. Nothing changes about the format or handling of the authentication token itself, which remains an opaque blob o' bytes.

Part of #472

divergentdave · 2023-04-27T16:37:26Z

core/src/task.rs

+        Ok(format!(
+            "Bearer {}",
+            str::from_utf8(self.as_ref())
+                .context("failed to covert authentication token to UTF-8")?


We should update aggregator_core/src/task.rs:<Task as TryFrom<SerializedTask>>::try_from() to check that authentication tokens are valid UTF-8. Currently it only checks that they are valid HTTP header values. (alternately, we could construct a Vec<u8> here)

I'm now taking a look at the places where we generate and store auth tokens across the aggregator API and the Janus API and TBH it's a bit of a mess of (unintentional?) double-base64-encoding. I'm going to take a beat to try to untangle this and separate what an auth token is (opaque blob of bytes) vs. how it's represented for transport via HTTP headers or k8s secrets or YAML or environment variables.

My understanding of AuthenticationToken was that it stored the token bytes directly, rather than an additionally-encoded form of the token. The reason I have this understanding is that the code that reads authentication tokens out of the DB places the bytes it reads directly into an AuthenticationToken with no further encoding/decoding, and there is no good reason for the DB to store an encoded form of the token.

For that reason, I think we want to base64-encode here.

Yes, I agree, and symmetrically, extract_bearer_token must base64 decode, since this particular layer of base64 encoding is a detail of how we transport the token in the context of an authorization: Bearer header.

Where this gets confusing is that the impl Distribution<AuthenticationToken> for Standard constructs AuthenticationTokens by generating 16 random bytes, then encodes that in Base64URL-no-padding, then the bytes of that string are the contents of the AuthenticationToken. So this worked because the bytes of the authentication token were themselves a valid UTF-8 and Base64 string. I'm not sure if it makes sense to generate AuthenticationToken that way (at a minimum it's a less efficient way to store 16 bytes of entropy) but that's orthogonal to the goals of this change, so I won't mess with it.

Bonus: that makes this method infallible.

Yeah -- I think the reason we made that choice was because the semantics of DAP-Auth-Token are "there is a correct token that must match for authz to pass, we pass it back and forth over an HTTP header" (which implies it must fit into an HTTP header, i.e. it is ASCII). So we chose to generate a random token by base64-encoding some random bytes, which is a handy way to create a token with a given amount of entropy that fits the ASCII requirement -- but in this case, the actual token is the base64-encoded string, not the underlying bytes that were randomly generated. And any process that could encode an ASCII string from an arbitrary byte buffer would work equally well, e.g. we could have hex-encoded, base32-encoded, etc.

Now that we're using Authorization: Bearer, the semantics are defined somewhat differently: we must send data that is at least shaped like base64-encoded data (RFC). Since our authorization tokens are actually arbitrary byte strings, we need to base64-encode here.

Once we have migrated away from supporting DAP-Auth-Token at all, or refactored things such that we can generate auth tokens per-authorization-method, we can change our encoding. It's a bit unfortunate we defined DAP-Auth-Token so loosely--maybe we should change our encoding scheme to keep things from getting confusing in practice? Or maybe we should support receiving auth tokens on a per-authorization-scheme basis, to avoid locking ourselves into the pattern that every authorization scheme must work with the same set of arbitrary bytes (or, alternatively, that our generated tokens must fit the requirements of every authorization scheme)?

(I'm going to send a PR to change randomly-generated authorization tokens to be shaped like hex-encoded data rather than base64-encoded data, as the current situation is going to get very confusing as we implement authorization schemes that require base64-encoding the bearer token.)

edit: #1308

divergentdave · 2023-04-27T16:44:03Z

core/src/http.rs

+        if let Some(received_token) = authorization_value.as_ref().strip_prefix(b"Bearer ") {
+            return Ok(Some(AuthenticationToken::from(received_token.to_vec())));
+        } else {
+            return Err(anyhow!("authorization header value is not a bearer token"));


I think this code could be simplified if Authorization header values with the wrong prefix were treated the same as missing headers, and this returned None in both cases. Some of the call sites already treat both outcomes the same, while others have two different paths that ultimately result in 403 status codes.

I don't think I agree: it's important to distinguish between the case where the authorization header is absent (in which case we want janus_aggregator to fall back to DAP-Auth-Token) and the case where the header is present but malformed (in which case we don't want to risk concealing a bug in the requestor by falling back to DAP-Auth-Token). I do notice I'm missing a test to verify that presenting a malformed authorization header causes a failure

branlwyd · 2023-04-27T17:50:45Z

core/src/task.rs

+        Ok(format!(
+            "Bearer {}",
+            str::from_utf8(self.as_ref())
+                .context("failed to covert authentication token to UTF-8")?


My understanding of AuthenticationToken was that it stored the token bytes directly, rather than an additionally-encoded form of the token. The reason I have this understanding is that the code that reads authentication tokens out of the DB places the bytes it reads directly into an AuthenticationToken with no further encoding/decoding, and there is no good reason for the DB to store an encoded form of the token.

For that reason, I think we want to base64-encode here.

divergentdave · 2023-04-27T18:53:37Z

aggregator_api/src/lib.rs

+        }
+    };
+
+    let decoded = match STANDARD.decode(bearer_token) {


This should be removed now that extract_bearer_token() does base64 decoding

branlwyd · 2023-04-27T21:12:27Z

aggregator/src/aggregator/aggregate_init_tests.rs

+    test_case
+}
+
+async fn setup_aggregate_init_test_stage_1() -> AggregationJobInitTestCase {


Q: what does stage_1 mean here? could this be given a better name?

I renamed it to clarify its purpose.

When acting as an HTTP server (i.e., helpers handling aggregation job or aggregate share messages, or leaders handling collection requests), Janus will now check for the authentication token in the more standard `Authorization: Bearer <token>` format defined in RFC 6750, as it already does in the aggregator API. If that header is absent, it falls back to checking `DAP-Auth-Token` for compatibility with existing peers. Nothing changes about the format or handling of the authentication token itself, which remains an opaque blob o' bytes. Part of #472

tgeoghegan requested a review from a team as a code owner April 27, 2023 00:08

tgeoghegan mentioned this pull request Apr 27, 2023

Revisit request authentication #472

Closed

divergentdave reviewed Apr 27, 2023

View reviewed changes

tgeoghegan force-pushed the timg/authn-accept-either-token branch from 1bb53ff to 07b68a9 Compare April 27, 2023 17:43

branlwyd requested changes Apr 27, 2023

View reviewed changes

tgeoghegan requested review from divergentdave and branlwyd April 27, 2023 18:18

divergentdave reviewed Apr 27, 2023

View reviewed changes

tgeoghegan force-pushed the timg/authn-accept-either-token branch from 8cf56bc to 7a9be6b Compare April 27, 2023 21:10

tgeoghegan requested a review from divergentdave April 27, 2023 21:10

branlwyd approved these changes Apr 27, 2023

View reviewed changes

tgeoghegan force-pushed the timg/authn-accept-either-token branch from e3d70bd to bc8cf98 Compare April 27, 2023 21:29

tgeoghegan merged commit d3594e8 into main Apr 27, 2023

tgeoghegan deleted the timg/authn-accept-either-token branch April 27, 2023 22:39

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Honor `DAP-Auth-Token` or `Authorization: Bearer` #1306

Honor `DAP-Auth-Token` or `Authorization: Bearer` #1306

tgeoghegan commented Apr 27, 2023

divergentdave Apr 27, 2023

tgeoghegan Apr 27, 2023

branlwyd Apr 27, 2023

tgeoghegan Apr 27, 2023 •

edited

Loading

branlwyd Apr 27, 2023

branlwyd Apr 27, 2023 •

edited

Loading

divergentdave Apr 27, 2023

tgeoghegan Apr 27, 2023

branlwyd Apr 27, 2023

divergentdave Apr 27, 2023

branlwyd Apr 27, 2023

tgeoghegan Apr 27, 2023

Honor DAP-Auth-Token or Authorization: Bearer #1306

Honor DAP-Auth-Token or Authorization: Bearer #1306

Conversation

tgeoghegan commented Apr 27, 2023

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

tgeoghegan Apr 27, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

branlwyd Apr 27, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Honor `DAP-Auth-Token` or `Authorization: Bearer` #1306

Honor `DAP-Auth-Token` or `Authorization: Bearer` #1306

tgeoghegan Apr 27, 2023 •

edited

Loading

branlwyd Apr 27, 2023 •

edited

Loading